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Abstract 

This paper describes a forward algorithm and an adjoint algorithm for computing 
sensitivity derivatives in chaotic dynamical systems, such as the Lorenz attractor. 
The algorithms compute the derivative of long time averaged "statistical" quantities 
to infinitesimal perturbations of the system parameters. The algorithms are demon- 
strated on the Lorenz attractor. We show that sensitivity derivatives of statistical 
quantities can be accurately estimated using a single, short trajectory (over a time 
interval of 20) on the Lorenz attractor. 

Key words: Sensitivity analysis, linear response, adjoint equation, unsteady 
adjoint, chaos, statistical average, Lyapunov exponent, Lyapunov covariant vector, 
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1 Introduction 

Computational methods for sensitivity analysis is a powerful tool in modern 
computational science and engineering. These methods calculate the deriva- 
tives of output quantities with respect to input parameters in computational 
simulations. There are two types of algorithms for computing sensitivity deriva- 
tives: the forward algorithms and the adjoint algorithms. The forward algo- 
rithms arc more efficient for computing sensitivity derivatives of many output 
quantities to a few input parameters; the adjoint algorithms are more efficient 
for computing sensitivity derivatives of a few output quantities to many input 
parameters. Key application of computational methods for sensitivity analysis 
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include aerodynamic shape optimization [3], adaptive grid refinement [9], and 
data assimilation for weather forecasting [S]. 

In simulations of chaotic dynamical systems, such as turbulent flows and the 
climate system, many output quantities of interest are "statistical averages" . 
Denote the state of the dynamical system as x{t); for a function of the state 
J{x), the corresponding statistical quantity (J) is defined as an average of 
J{x{t)) over an infinitely long time interval: 



For ergodic dynamical systems, a statistical average only depends on the gov- 
erning dynamical system, and does not depend on the particular choice of 
trajectory x(t). 

Many statistical averages, such as the mean aerodynamic forces in turbulent 
flow simulations, and the mean global temperature in climate simulations, are 
of great scientific and engineering interest. Computing sensitivities of these 
statistical quantities to input parameters can be useful in many applications. 

The differentiability of these statistical averages to parameters of interest as 
been established through the recent developments in the Linear Response The- 
ory for dissipative chaos [n][Z|- A class of chaotic dynamical systems, known as 
"quasi-hyperbolic" systems, has been proven to have statistical quantities that 
are differentiable with respect to small perturbations. These systems include 
the Lorenz attractor, and possibly many systems of engineering interest, such 
as turbulent flows. 

Despite recent advances both in Linear Response Theory and in numerical 
methods for sensitivity computation of unsteady systems [TU] sensitivity 
computation of statistical quantities in chaotic dynamical systems remains 
difficult. A major challenge in computing sensitivities in chaotic dynamical 
systems is their sensitivity to initial condition, commonly known as the "but- 
terfly effect". The linearized equations, used both in forward and adjoint sen- 
sitivity computations, give rise to solutions that blow up exponentially. When 
a statistical quantity is approximated by a finite time average, the computed 
sensitivity derivative of the finite time average diverges to infinity, instead of 
converging to the sensitivity derivative of the statistical quantity J5\. Existing 
methods for computing correct sensitivity derivatives of statistical quantities 
usually involve averaging over a large number of ensemble calculations [5] [1] . 
The resulting high computation cost makes these methods not attractive in 
many applications. 

This paper outlines a computational method for efficiently estimating the 
sensitivity derivative of time averaged statistical quantities, relying on a single 
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trajectory over a small time interval. The key idea of our method, inversion 
of the "shadow" operator, is already used as a tool for proving structural 
stability of strange attractors [B]. The key strategy of our method, divide and 
conquer of the shadow operator, is inspired by recent advances in numerical 
computation of the Lyapunov covariant vectors [2]|llj. 

In the rest of this paper. Section |2] describes the "shadow" operator, on which 
our method is based. Section |3] derives the sensitivity analysis algorithm by 
inverting the shadow operator. Section |4] introduces a fix to the singularity 
of the shadow operator. Section [5] summarizes the forward sensitivity analysis 
algorithm. Section |6] derives the corresponding adjoint version of the sensitiv- 
ity analysis algorithm. Section [7] demonstrates both the forward and adjoint 
algorithms on the Lorenz attractor. Section |8] concludes this paper. 

The paper uses the following mathematical notation: Vector fields in the state 
space (e.g. /(x), (pi{x)) are column vectors; gradient of scalar fields (e.g. -7^) 
are row vectors; gradient of vector fields (e.g. are matrices with each row 
being a dimension of /, and each column being a dimension of x. The (■) sign 
is used to identify matrix-vector products or vector-vector inner products. For 
a trajectory x(t) satisfying ^ = /(x) and a scalar or vector field a{x) in the 
state space, we often use f to denote The chain rule f = ^-f = 

is often used without explanation. 



2 The "Shadow Operator 



For a smooth, uniformly bounded n dimensional vector field 6x{x), defined on 
the n dimensional state space of x. The following transform defines a slightly 
"distorted" coordinates of the state space: 

x'{x) = X + e 6x{x) (2) 

where e is a small real number. Note that for an infinitesimal e, the following 
relation holds: 

x'{x) — X = e6x{x) = e6x{x') + O(e^) (3) 

We call the transform from x to "shadow coordinate transform". In 

particular, consider a trajectory x{t) and the corresponding transformed tra- 
jectory x'(t) = x'{x(t)). For a small e, the transformed trajectory x'(t) would 
"shadow" the original trajectory x(t), i.e., it stays uniformly close to x(t) 
forever. Figure [T] shows an example of a trajectory and its shadow. 

Now consider a trajectory x{t) satisfying an ordinary differential equation 

X = fix) , (4) 
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Fig. 1. A trajectory of the Lorenz attractor under a shadow coordinate transform. 
The black trajectory shows x{t), and the red trajectory shows x'{t). The perturba- 
tion e6x shown corresponds to an infinitesimal change in the parameter r, and is 
explained in detail in Section [7j 

with a smooth vector field f{x) as a function of x. The same trajectory in the 
transformed "shadow" coordinates x'{t) do not satisfy the same differential 
equation. Instead, from Equation ([3]), we obtain 

06x 

x' = f{x) + e— — ■ f{x) 

dX 

ox ox 

In other words, the shadow trajectory x'{t) satisfies a slightly perturbed equa- 
tion 

x' = f{x') + eSfix') + Oie^) (6) 
where the perturbation 6f is 

df d5x (7) 



{Sf5x){ 



X] 



For a given differential equation x = f{x), Equation ([T]) defines a linear op- 
erator Sf : 6x ^ 6f. We call Sf the "Shadow Operator" of /. For any 
smooth vector field 6x{x) that defines a slightly distorted "shadow" coordi- 
nate system in the state space, Sj determines a unique smooth vector field 
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Sf{x) that defines a perturbation to the differential equation. Any trajectory 
of the original differential equation would satisfy the perturbed equation in 
the distorted coordinates. 



Given an ergodic dynamical system x = f{x), and a pair {6x, 6f) that satisfies 
6f = Sf6x, 6x determines the sensitivity of statistical quantities of the 
dynamical system to an infinitesimal perturbation eSf. Let J{x) be a smooth 
scalar function of the state, consider the statistical average (J) as defined in 
Equation ([T|. The sensitivity derivative of (J) to the infinitesimal perturbation 
e 6f is by definition 



d{J) 
de 



1 / 1 /"^ 1 ^ 

lim- lim - / J(x'(t))dt- lim - / J(x(t))dt 



where by the ergodicity assumption, the statistical average of the perturbed 
system can be computed by averaging over x'(t), which satisfies the perturbed 
governing differential equation. Continuing from Equation ([s]). 



de T^oo T Jo e 

1 fT J{x'{t)) - J{x{t)) 




= lim lim — / ' ' " ' ^ " dt (9) 

T^oo e^oT Jo e ^ ^ 

= lim — / ■ Sx dt = 

T^oo T Jo dx 

Equation ^ represents the sensitivity derivative of a statistical quantity (J) 
to the size of a perturbation e6f. There are two subtle points in : 

• The two limits lime_j.o and limT-^oo can commute with each other for the 
following reason: The two trajectories x'{t) and x{t) stay uniformly close 
to each other forever; therefore. 



J(x'(t)) - J(x(t)) dJ 



e dx 
uniformly for all t. Consequently, 



6x (10) 



1 rJiAt))-J(At))^^._^ir-0J,^^,^ 



T Jo e T Jo dx 

uniformly for all T. Thus the two limits commute. 

The two trajectories x'{t) and x{t) start at two specially positioned pair of 
initial conditions x'{0) = x{0) + eSx{x{0)). Almost any other pair of initial 
conditions (e.g. x'(0) = a;(0)) would make the two trajectories diverge as a 
result of the "butterfly effect" . They would not stay uniformly close to each 
other, and the limits lim^^o and limj-^oo would not commute. 
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Equation ^ represents the sensitivity derivative of the statistical quantity (J) 
to the infinitesimal perturbation eSf as another statistical quantity ■ Sx). 
We can compute it by averaging |^ • 5x over a sufficiently long trajectory, 
provided that 6x = S~^5f is known along the trajectory. The next section 
describes how to numerically compute 6x = S^^6f for a given 6f. 



3 Inverting the Shadow Operator 



Perturbations to input parameters can often be represented as perturbations 
to the dynamics. Consider a differential equation x = f{x, si, S2, ■ ■ ■ , Sm) pa- 
rameterized by m input variables, an infinitesimal perturbation in a input 
parameter Sj — )■ Sj + e can be represented as a perturbation to the dynamics 

Equation ^ defines the sensitivity derivative of the statistical quantity (J) to 
an infinitesimal perturbation e 6f, provided that a 6x can be found satisfying 
6f = Sf6x, where Sf is the shadow operator. To compute the sensitivity by 
evaluating Equation (|9|, one must first numerically invert Sf for a given Sf 
to find the corresponding 6x. 

The key ingredient of numerical inversion of Sf is the Lyapunov spectrum 
decomposition. This decomposition can be efficiently computed numerically 
[TT] |2]. In particular, we focus on the case when the system x = f{x) 
has distinct Lyapunov exponents. Denote the Lyapunov covariant vectors as 
4>i{x), (j)2{x), . . . , (pnix). Each 0j is a vector field in the state space satisfying 

j^um) = ^ ■ um) - A.0.(x(t)) (12) 

where Ai, A2, . . . , are the Lyapunov exponents in decreasing order. 

The Lyapunov spectrum decomposition enables a divide and conquer strategy 
for computing 5x = Sj^5f. For any 5f{x) and every point x on the attractor, 
both 6x{x) and Sf{x) can be decomposed into the Lyapunov covariant vector 
directions almost everywhere, i.e. 



Sx{x) = '^a^{x) (f>i{x) , (13) 
1=1 

n 

5fix) = J2a{ix)Mx), (14) 
1=1 

where and a{ are scalar fields in the state space. From the form of Sf in 
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Equation ([T]), we obtain 
df 



Sf(af 



d 



< (x) g ■ Ux) + Ux) + a^x) ^ 



(15) 



By substituting Equation (|12|) into the last term of Equation (15), we obtain 

(16) 



dt 



By combining Equation (16) with Equations (13), (14) and the hnear relation 
5f = Sf6x, we finally obtain 



daj 



i=l 



i=l 



(17) 



Equations (16) and (17) are useful for two reasons: 



(1) They indicate that the Shadow Operator Sf, applied to a scalar field 
a^{x) multiple of 0j(x), generates another scalar field a{{x) multiple of 
the same vector field (j)i{x). Therefore, one can compute Sj^5f by first 



decomposing 6f as in Equation (14) to obtain the a{ . If each can 
be calculated from the corresponding a{ , then 6x can be computed with 



Equation (13), completing the inversion. 



(2) It defines a scalar ordinary differential equation that governs the relation 
between and a{ along a trajectory x{t): 



da^lx) 
dt 



lUx) + Aj afix) 



This equation can be used to obtain from a{ along a trajectory, thereby 
filling the gap in the inversion procedure of Sf outlined above. For each 
positive Lyapunov exponent Aj, one can integrate the ordinary differential 
equation 

dcvf 



dt 



a. 



{ + Aj d^ 



(19) 



backwards in time from an arbitrary terminal condition, and the differ- 
ence between df{t) and the desired a^{x) will decrease exponentially. For 
each negative Lyapunov exponent Aj, Equation (19) can be integrated 



forward in time from an arbitrary initial condition, and d^{t) will con- 
verge exponentially to the desired a^(a;). For a zero Lyapunov exponent 
Ai = 0, Section |4] introduces a solution. 
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4 Time Dilation and Compression 



There is a fundamental problem in the inversion method derived in Section 
3j 5/ is not invertible for certain 6f. This can be shown with the following 
analysis: Any continuous time dynamical system with a non-trivial attractor 
must have a zero Lyapunov exponent A^q = 0. The corresponding Lyapunov 
covariant vector is (pnoi^) = fi^)- This can be verified by substituting Aj = 



and (f)i = f into Equation rtl2h. For this i = uq, Equations (19) becomes 



= (20) 



By taking an infinitely long time average on both sides of Equation (20), we 
obtain 

a/.W)^l..n«-"^>':<°'-'°>'^0, (21) 



Equation (21) implies that for any Sf = SfSx, the i = uq component of its 
Lyapunov decomposition (as in Equation (14)) must satisfy (a{g(a;)) = 0. Any 



6f that do not satisfy this linear relation, e.g. 6f = /, would not be in the 

/ 



range space of Sf. Thus the corresponding 5x = Sf ^5f does not exist 



Our solution to the problem is complementing Sf with a "global time dilation 
and compression" constant i], whose effect produces a 6f that is outside the 
range space of Sf. We call i] a time dilation constant for short. The combined 
effect of a time dilation constant and a shadow transform could produce all 
smooth perturbations 6f. 

The idea comes from the fact that for a constant rj, the time dilated or com- 
pressed system x = {l + eri)f{x) has exactly the same statistics (J), as defined 
in Equation ([T|, as the original system x = f{x). Therefore, the perturba- 
tion in any (J) due to any e6f is equal to the perturbation in (J) due to 
e {r]f{x) + 6f{x)). Therefore, the sensitivity derivative to Sf can be computed 
if we can find a 6x that satisfies Sf6x = f]f{x) + Sf{x) for some t]. 

We use the "free" constant r] to put rjf\x) + 5f\x) into the range space of Sf. 



By substituting r]f(x) + 6f{x) into the constraint Equation (21) that identifies 



the range space of Sf, the appropriate r] must satisfy the following equation 

V+{aij = 0, (22) 
which we use to numerically compute rj. 

Once the appropriate time dilation constant rj is computed, rjf{x) + 5f{x) 
is in the range space of Sf. We use the procedure in Section [3] to compute 
5x = Sf^{r]f + 5f), then use Equation ^ to compute the desired sensitivity 



8 



derivative d{J)/de. The addition of r]f to Sf affects Equation (19) only for 
i = no, making it 

^-^ = <(^) + V- (23) 



Equation (23) indicates that can be computed by integrating the right 



hand side along the trajectory. 



The solution to Equation (23) admits an arbitrary additive constant. The 



effect of this arbitrary constant is the following: By substituting Equations 



(13) into Equation (ph, the contribution from the i = uq term of Sx to d{J)/de 



IS 

lim ^ r al'^dt (24) 
T^oo T Jo dt ^ ^ 

Therefore, any constant addition to a^^^ vanishes as T — )■ oo. Computationally, 

however. Equation ^ must be approximated by a finite time average. We find 

it beneficial to adjust the level of to approximately {al^) = 0, in order to 

control the error due to finite time averaging. 



5 The Forward Sensitivity Analysis Algorithms 



For a given x = f{x), 6f and J{x), the mathematical developments in Sec- 
tions [3] and |4] are summarized into Algorithm [T] for computing the sensitivity 
derivative dS{J)/de as in Equation 



The preparation phase of the algorithm (Steps [Tjjs]) computes a trajectory and 
the Lyapunov spectrum decomposition along the trajectory. The algorithm 
then starts by decomposing df (Step|4]), followed by computing 5x (Steps [s]- 
[7|, and finally computing d{J) /de (Stepjsj). The sensitivity derivative of many 
different statistical quantities (Ji), (J2), ... to a single 5f can be computed by 
only repeating the last step of the algorithm. Therefore, this is a "forward" 
algorithm in the sense that it efficiently computes sensitivity of multiple output 
quantities to a single input parameter (the size of perturbation eSf). We will 
see that this is in sharp contrast to the "adjoint" algorithm described in Section 
[6} which efficiently computes the sensitivity derivative of one output statistical 
quantity (J) to many perturbations (5/i, 5/2, . . .. 

It is worth noting that the 5x computed using Algorithm [T] satisfies the forward 
tangent equation 

5x=y^-5x + r^f + 5f (25) 



This can be verified by taking derivative of Equation (13), substituting Equa- 



tions (19) and (23), then using Equation (14). However, 5x must satisfy both 
an initial condition and a terminal condition, making it difficult to solve with 
conventional time integration methods. In fact. Algorithm [T] is equivalent to 
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Algorithm 1 The Forward Sensitivity Analysis Algorithm 

(1) Choose a "spin-up buffer time" Tb, and an "statistical averaging time" 
Ta- Tb should be much longer than l/|Aj| for all nonzero Lyapunov ex- 
ponent Aj, so that the solutions of Equation ( 19 ) can reach over a time 
span of Tb- Ta should be much longer than the decorrelation time of the 
dynamics, so that one can accurately approximate a statistical quantity 
by averaging over [0,Ta]. 

(2) Obtain an initial condition on the attractor at t = —Tb, e.g., by solving 
X = f{x) for a sufficiently long time span, starting from an arbitrary 
initial condition. 

(3) Solve X = f{x) to obtain a trajectory x(t),t G [—Tb,Ta + Tb]; compute 
the Lyapunov exponents Aj and the Lyapunov covariant vectors (f)i{x(t)) 
along the trajectory, e.g., using algorithms in [11] and [2]. 

(4) Perform the Lyapunov spectrum decomposition of Sf(x ) a 



jectory x(t) to obtain a{{x),i = 1, . . . , n as in Equation (14) 



(5) Compute the global time dilation constant rj using Equation (22) 



ong the tra- 



(6) Solve the differential equations (19) to obtain af over the time interval 



[0,T^]. The equations with positive Aj are solved backward in time from 
t = Ta + Tb to t = 0] the ones with negative Aj are solved forward in 



time from t = —Tb to t = Ta- For A„q = 0, Equation (23) is integrated 



and the mean of a^, is set to zero. 



(7) Compute Sx along the trajectory x{t),t G [0,Ta] with Equation (13). 

(8) Compute d{J)/de using Equation ([T| by averaging over the time interval 
[0,Ta]- 

splitting 6x into stable, neutral and unstable components, corresponding to 



positive, zero and negative Lyapunov exponents; then solving Equation (25) 
separately for each component in different time directions. This alternative 
version of the forward sensitivity computation algorithm could be useful for 
large systems to avoid computation of all the Lyapunov covariant vectors. 



6 The Adjoint Sensitivity Analysis Algorithm 



The adjoint algorithm starts by trying to find an adjoint vector field /(x), 
such that the sensitivity derivative of the given statistical quantity (J) to any 
infinitesimal perturbation e 6f can be represented as 



(26) 



Both / in Equation (|26|) and |^ in Equation (|9|) can be decomposed into linear 



combinations of the adjoint Lyapunov covariant vectors almost everywhere on 
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the attractor: 



i=l 



dx 



(27) 
(28) 



i=l 



where the adjoint Lyapunov covariant vectors ipi satisfy 



(29) 



ip{x)'^ ■ (f){x) 



(30) 



With proper normahzation, the (primal) Lyapunov covariant vectors (pi and 
the adjoint Lyapunov covariant vectors ipi have the following conjugate rela- 
tion: 

1 , i = j 

i.e., the n x n matrix formed by all the 0j and the n x n matrix formed by all 
the ipi are the transposed inverse of each other at every point x in the state 
space. 

By substituting Equations (13) and (28) into Equation ([9]), and using the 
conjugate relation in Equation (30), we obtain 



de 



(31) 



1=1 



Similarly, by combining Equations (26), (14), (27) and (30), it can be shown 
that / satisfies Equation (26) if and only if 

d{J) 



de 



J2 i^i^i 



i=l 



(32) 



Comparing Equations (31) and (32) leads to the following conclusion: Equa- 
tion (26) can be satisfied by finding d{ that satisfy 



(d{a{) = (afa^ , i = l,...,n 



(33) 



and integrate by parts in time, we obtain 



The d{ that satisfies Equation ([33| can be found using the relation between 
a{ and af in Equation (18). By multiplying d{ on both sides of Equation (18) 



1 
T 



^ -^f f ,j. H «i 

al a{ dt = 
T 



1 
T 



'dd[ 
dt 



+ Aj d{ af dt 



(34) 
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for i ^ riQ. Through apply the same technique to Equation (23), we obtain for 

i = no 



T 



1 

T 



dt 



Vdijt 



(35) 



If we set df to satisfy the following relations 



dt 
dd{{x 



dt 



i 7^ no 
i = no 



(36) 



then Equations (|34|) and (|35|) become 

1 rT 



TJo 
1 rT 



d{a{dt 



"fx 



T 



+ 



T7o 



d{a{dt 



-fx 



T 



T7o 



df dt 



+ 



TJo 



d^a^ dt + r] 



TJo 



T 



L dt - {di) 



i ^ no 

i = Uq 
(37) 



As T — )■ oo, both equations reduces to Equation (33). 



In summary, if the scalar fields d{ satisfy Equation (36), then they also satisfy 
Equation (37) and thus Equation (33); as a result, the / formed by these 



d'f through Equation (27) satisfies Equation (26), thus is the desired adjoint 
vector field. 



For each i ^ uq, the scalar field d{ satisfying Equation (36) can be computed 
by solving an ordinary differential equations 



dd{ 
~dt 



d^ + Aj d{ . 



(3J 



Contrary to computation of through solving Equation (19), the time inte- 
gration should be forward in time for positive Aj, and backward in time for 
negative Aj, in order for the difference between d{{t) and d{{x{t)) to diminish 
exponentially. 



The i = riQ equation in Equation (|36|) can be directly integrated to obtain 



d^ (x). The equation is well defined because the right hand side is mean zero: 



T 



diSx{t))dt 



1 
T 



T dj 

dx 



dt = - r^dt'^-^O. (39) 
T Jo dt ^ ^ 



Therefore, the integral of d^^{x) over time, subtracted by its mean, is the 
solution d^jj(a;) to the i = Uq case of Equation (36). 
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Algorithm 2 The Adjoint Sensitivity Analysis Algorithm 

(1) Choose a "spin-up buffer time" Tb, and an "statistical averaging time" 
Ta- Tb should be much longer than l/|Aj| for all nonzero Lyapunov ex- 
ponent Aj, so that the solutions of Equation ( 19 ) can reach over a time 
span of Tb- Ta should be much longer than the decorrelation time of the 
dynamics, so that one can accurately approximate a statistical quantity 
by averaging over [0,Ta]. 

(2) Obtain an initial condition on the attractor at t = —Tb, e.g., by solving 
X = f{x) for a sufficiently long time span, starting from an arbitrary 
initial condition. 

(3) Solve X = f{x) to obtain a trajectory x(t),t G [—Tb,Ta + Tb]; compute 
the Lyapunov exponents Aj and the Lyapunov covariant vectors 

along the trajectory, e.g., using algorithms in [11] and [2]. 

(4) Perform the Lyapunov spectrum decomposition of (dJ/dx)^ along the 



trajectory x{t) to obtain a^{x{t))A = 1, . . . , n as in Equation (28). 
(5) Solve the differential equations (38) to obtain a{{x{t)) over the time in- 



terval [0, Ta\ - The equations with negative Aj are solved backward in time 
from i = T4 + Tg to t = 0; the ones with positive Aj are solved forward 
in time from t = —Tb to t = Ta- For i = uq, the scalar —a^^ is inte- 
grated along the trajectory; the mean of the integral is subtracted from 
the integral itself to obtain a-^ 



no' 



(6) Compute / along the trajectory x{t),t G [0,T4] with Equation (27). 

(7) Compute d{J) /de using Equation ( [26) ) by averaging over the time interval 

[0,Ta]. 

The above analysis summarizes to Algorithm [2] for computing the sensitivity 
derivative derivative of the statistical average (J) to an infinitesimal pertur- 
bations eSf. The preparation phase of the algorithm (Steps [l]|3]) is exactly the 
same as in Algorithm [l] These steps compute a trajectory x{t) and the Lya- 
punov spectrum decomposition along the trajectory. The adjoint algorithm 
then starts by decomposing the derivative vector (dJ/dx)^ (Step|4|, followed 
by computing the adjoint vector 5f (Steps [5|j6]), and finally computing d{J)/de 
for a particular 6f. Note that the sensitivity of the same (J) to many different 
perturbations 6/1,6/2, ■ ■ ■ can be computed by repeating only the last step of 
the algorithm. Therefore, this is an "adjoint" algorithm, in the sense that it 
efficiently computes the sensitivity derivatives of a single output quantity to 
many input perturbation. 



It is worth noting that / computed using Algorithm |2j satisfies the adjoint 
equation 

_/.^'./_^ (40) 

dx dx 



This can be verified by taking derivative of Equation (27), substituting Equa- 



tion (36), then using Equation (28). However, / must satisfy both an initial 



condition and a terminal condition, making it difficult to solve with conven- 
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tional time integration methods. In fact, Algorithm |2] is equivalent to splitting 
/ into stable, neutral and unstable components, corresponding to positive, 
zero and negative Lyapunov exponents; then solving Equation (40) separately 
for each component in different time directions. This alternative version of the 
adjoint sensitivity computation algorithm could be useful for large systems, 
to avoid computation of all the Lyapunov covariant vectors. 



7 An Example: the Lorenz Attractor 



We consider the Lorenz attractor x = f{x), where x = (xi, X2, Xs)"^, and 



fix) 



\ 



a[X2 — Xi 
Xi(r - 0:3) - X2 

y X1X2 - (3x3 J 



(41) 



The "classic" parameter values a = 10, r = 28, (3 = 8/3 are used. Both the 
forward sensitivity analysis algorithm (Algorithm [T]) and the adjoint sensitivity 
analysis algorithm (Algorithm [2]) are performed on this system. 

We want to demonstrate the computational efficiency of our algorithm; there- 
fore, we choose a relatively short statistical averaging interval of Ta = 10, and 
a spin up buffer period of Tb = 5. Only a single trajectory of length T4 + 2Tb 
on the attractor is required in our algorithm. Considering that the oscillation 
period of the Lorenz attractor is around 1, the combined trajectory length of 
20 is a reasonable time integration length for most simulations of chaotic dy- 
namical systems. In our example, we start the time integration from t = —10 
at X = (-8.67139571762,4.98065219709,25), and integrate the equation to 
t = —5, to ensure that the entire trajectory from — Tb to Ta + Tb is roughly 
on the attractor. The rest of the discussion in this section are focused on the 
trajectory x(t) for t G [—Tb, Ta + Tb]- 



7.1 Lyapunov covariant vectors 



The Lyapunov covariant vectors are computed in Step [3] of both Algorithm [T] 
and Algorithm [2| over the time interval [—Tb, Ta + Tb]- These vectors, along 
with the trajectory x{t), are shown in Figure |2} 

The three dimensional Lorenz attractor has three pairs of Lyapunov exponents 
and Lyapunov covariant vectors. Ai is the only positive Lyapunov exponent. 
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2 4 6 8 

(a) The state vector x 




2 4 6 8 10 

(b) First Lyapunov covariant vector (f)i 




2 4 6 8 10 

(c) Second Lyapunov covariant vector i 




2 4 6 8 10 

(d) Third Lyapunov covariant vector 03 



Fig. 2. The Lyapunov covariant vectors of the Lorenz attractor along the trajectory 
x{t) for t G [0, 10]. The x-axes are t; the blue, green and red lines correspond to the 
xi,X2 and 2:3 coordinates in the state space, respectively. 



and 01 is computed by integrating the tangent linear equation 



X 



dl 
dx 



■ X 



(42) 



forward in time from an arbitrary initial condition at t = — Tg. The first 
Lyapunov exponent is estimated to be Ai ~ 0.95 through a linear regression of 
X in the log space. The first Lyapunov vector is then obtained as 0i = x e 



-Alt 



A2 = is the vanishing Lyapunov exponent; therefore 
02 equal to 1. 



6 /(x), where 



2) is a normalizing constant that make the mean magnitude of 



The third Lyapunov exponent A3 is negative. So 03 is computed by integrating 



the tangent linear equation (42) backwards in time from an arbitrary initial 
condition at t = T4 + Tb. The third Lyapunov exponent is estimated to be 
A3 ~ —14.6 through a linear regression of the backward solution x in the log 
space. The third Lyapunov vector is then obtained as 



X e 
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1.2 Forward Sensitivity Analysis 



We demonstrate our forward sensitivity analysis algorithm by computing the 
sensitivity derivative of three statistical quantities (xl), {x"^) and, (x^) to a 
small perturbation in the system parameter r in the Lorenz attractor Equation 



(41 ). The infinitesimal perturbation r — )■ r + e is equivalent to the perturbation 



e6f = e^ = ei0,xi,0f 



(43) 




100 




-50 



-100 



(b) a{,i = 1,2,3 for the 5f 



Fig. 3. Lyapunov vector decomposition of 6f. The x-axes are t; the blue, green and 
red lines on the left are the first, second and third component of 5f as defined in 
Equation (43); the blue, green and red lines on the right are a{, and in the 



decomposition of 6f (Equation (14)), respectively. 



The forcing term defined in Equation (43) is plotted in Figure 3a Figure 3b 
plots the decomposition coefficients afTcomputed by solving a 3 x 3 linear 



system defined in Equation (14) at every point on the trajectory. 



For each a{ obtained through the decomposition, Equation (19) or (23) is 
solved to obtain af. For i = 1, Equation (19) is solved backwards in time 
from t = Ta + Tb to t = 0. For i 
estimated to he rj ■ 

i = 3, Equation (191) is solved forward in time from t = —Tb to t 



no 



2, the time compression constant is 
-2.78, and Equation (23) is integrated to obtain Og. For 

Ta- 



1,2,3 are plotted in Figure 4a, These values 



The resulting values of af,j 
are then substituted into Equation (13) to obtain Sx, as plotted in Figur e | 4b[ 
The "shadow" trajectory defined as x' = x + e5x is also plotted in Figure [l] as 
the red lines, for an e = 1/3. This Sx = Sj^5f is approximately the shadow 
coordinate perturbation "induced" by a 1/3 increase in the input parameter 
r, a.k.a. the Rayleigh number in the Lorenz attractor. 



The last step of the forward sensitivity analysis algorithm is computing the 
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(a) af,z 



4 6 8 

1,2,3 for the 5f 



10 




(b) 6x = 



i=l 



Fig. 4. Inversion of Sf for 6x = Sj ^Sf. The x-axes are t; the blue, green and red 
hnes on the left are af , af and af, respectively; the blue, green and red lines on the 



right are the first, second and third component of 6x, computed via Equation (13). 



sensitivity derivatives of the output statistical quantities using Equation (|9|. 
We found that using a windowed time averaging [1] yields more accurate 
sensitivities. Here our estimates over the time interval [0, Ta] are 



d{xl) 



2.64 



3.99 



d{x3) 



1.01 



(44) 



dr dr dr 

These sensitivity values compare well to results obtained through finite differ- 



ence, as shown in Section 7.4 



7.3 Adjoint Sensitivity Analysis 



We demonstrate our adjoint sensitivity analysis algorithm by computing the 
sensitivity derivatives of the statistical quantity (xa) to small perturbations in 



the three system parameters s, r and h in the Lorenz attractor Equation (41 ). 



The first three steps of Algorithm [2] is the same as in Algorithm [T| and has 



been demonstrated in Section 7.1 Step involves decomposing (dJ/dx) into 



three adjoint Lyapunov covariant vectors. In our case, J{x) = xs, therefore 
dJ/dx = (0,0,1), as plotted in Figure [5a} The adjoint Lyapunov covariant 



vectors ipi can be computed using Equation (30) by inverting the 3x3 matrix 
formed by the (primal) Lyapunov covariant vectors 0j at every point on the 
trajectory. The coefficients a^,z = 1,2,3 can then be computed by solving 
Equation (28). These scalar quantities along the trajectory are plotted in 



Figure 5b for t G [0,Ta]. 



Once we obtain d^, df can be com put ed by solving Equation (38). The solution 



is plotted in Figure 6a Equation (27) can then be used to combine the af into 



17 




Fig. 5. Adjoint Lyapunov vector decomposition of dJ /dx. The x-axes are i; the blue, 
green and red lines on the left are the first, second and third component of dJ/dx] 
the blue, green and red lines on the right are af , af and df in the decomposition of 
dJ/dx (Equation ([28])), respectively. 
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0.2 
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(a) al,i 



2 4 6 8 10 

1,2,3 solved using Equation 




(38) 



2 4 6 8 



10 



Fig. 6. Computation of the adjoint solution / for the Lorenz attractor. The x-axes 
are t; the blue, green and red lines on the left are d{, dg and dg, respectively; the 
blue, green and red lines on the right are the first, second and third component of 
/, computed via Equation (27). 



the adjoint vector /. The computed / along the trajectory is plotted both in 
Figure 6b as a function of t, and also in Figure [7] as arrows on the trajectory 
in the state space. 



The last step of the adjoint sensitivity analysis algorithm is computing the 
sensitivity derivatives of (J) to the perturbations 5fs = ^, = % and 
5fb = ^ using Equation (26). Here our estimates over the time interval [0, Ta\ 
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Fig. 7. The adjoint sensitivity derivative / as in Equation (26), represented by 
arrows on the trajectory. 



are computed as 

^^^0.21, ^^0.97, ^--1.74 (45) 
ds dr db 

The next section compares these sensitivity estimates, together with the sen- 
sitivity estimates computed in Section 7.2, to a finite difference study. 



7.4 Comparison with finite difference 



To reduce the noise in the computed statistical quantities in the finite differ- 
ence study, a very long time integration length of T = 100, 000 is used for 
each simulation. Despite this long time averaging, the quantities computed 
contain statistical noise of the order 0.01. The noise limits the step size of the 
finite difference sensitivity study. Fortunately all the output statistical quanti- 
ties seem fairly linear with respect to the input parameters, and a moderately 
large step size of the order 0.1 can be used. To further reduce the effect of 
statistical noise, we perform linear regressions through 10 simulations of the 
Lorenz attractor, with r equally spaced between 27.9 and 28.1. The total time 
integration length (excluding spin up time) is 1,000,000. The resulting com- 
putation cost is in sharp contrast to our method, which involves a trajectory 
of only length 20. 
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Similar analysis is performed for the parameters s and b, where 10 values of 
s equally spaced between 9.8 and 10.2 are used, and 10 values of b equally 
spaced between 8/3 — 0.02 and 8/3 + 0.02 are used. The slopes estimated from 
the linear regressions, together with 3a confidence intervals (where a is the 
standard error of the linear regression) is listed below: 



dr 
ds 



2.70 ±0.10 
0.16 ±0.02 



^^^'^-3.87 ±0.18, ^^^^^ 



dr 
djxs) 

db 



-1.68 ±0.15 . 



dr 



1.01 ±0.04 



(46) 




2.6 2.7 2.8 



(a) 



djxl) 

dr 




0.95 1.00 1.05 



(b) 



dr 



(c) 



d{x3) 
dr 



Fig. 8. Histogram of sensitivities computed using Algorithm [T] (forward sensitivity 
analysis) starting from 200 random initial conditions. Ta = 10, Tg = 5. The red re- 
gion identifies the Scr confidence interval estimated using finite difference regression. 
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dr 



1.1 




-2.7 -2.2 -1.7 -1.2 



(c) 



d{x3) 
db 



Fig. 9. Histogram of sensitivities computed using Algorithm [2] (adjoint sensitivity 
analysis) starting from 200 random initial conditions. Ta = 10, Tg = 5. The red re- 
gion identifies the Scr confidence interval estimated using finite difference regression. 

To further assess the accuracy of our algorithm, which involves finite time 
approximations to Equations ^ and (26), we repeated both Algorithm [l] and 
Algorithm |2] for 200 times, starting from random initial conditions at T = —10. 
We keep the statistical averaging time Ta = 10 and the spin up buffer time 



20 



Tb = 5. The resulting histogram of sensitivities computed with Algorithm [T] 
is shown in Figure |8] the histogram of sensitivities computed with Algorithm 
[2] is shown in Figure [9j The finite difference estimates are also indicated in 
these plots. 

We observe that our algorithms compute accurate sensitivities most of the 
time. However, some of the computed sensitivities seems to have heavy tails 
in their distribution. This may be due to behavior of the Lorenz attractor 
near the unstable fixed point (0,0,0). Similar heavy tailed distribution has 
been observed in other studies of the Lorenz attractor yy. They found that 
certain quantities computed on Lorenz attractor can have unbounded second 
moment. This could be the case in our sensitivity estimates. Despite this minor 
drawback, the sensitivities computed using our algorithm have good quality. 
Our algorithms are much more efficient than existing sensitivity computation 
methods using ensemble averages. 



8 Conclusion 



This paper derived a forward algorithm and an adjoint algorithm for com- 
puting sensitivity derivatives in chaotic dynamical systems. Both algorithms 
efficiently compute the derivative of statistical quantities (J) to infinitesimal 
perturbations e 6f to the dynamics. 



The forward algorithm starts from a given perturbation 6f, and computes 
a perturbed "shadow" coordinate system 6x, e.g. as shown in Figure [T] The 
sensitivity derivatives of multiple statistical quantities to the given 6f can be 
computed from 6x. The adjoint algorithm starts from a statistical quantity (J), 
and computes an adjoint vector /, e.g. as shown in Figure [?} The sensitivity 
derivative of the given (J) to multiple input perturbations can be computed 
from /. 



We demonstrated both the forward and adjoint algorithms on the Lorenz 
attractor at standard parameter values. The forward sensitivity analysis al- 
gorithm is used to simultaneously compute and the adjoint 
sensitivity analysis algorithm is used to simultaneously compute 
and ^^^§^- We show that using a single trajectory of length about 20, both 
algorithms can efficiently compute accurate estimates of all the sensitivity 
derivatives. 
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